(57) ABSTRACT

A hardware semaphore is one bit wide. A first hardware
circuit detects that one of the processes is writing a new
value to the semaphore and forces the hardware semaphore to
the new value written. A plurality of second hardware
circuits are provided. Each second hardware circuit is
associated with a separate one of the plurality of processes.
Each particular second hardware circuit includes a detecting
circuit that detects that the processor with which the
particular second hardware circuit is associated is attempting
to write the new value to the semaphore. A circuit responsive
to the detecting circuit provides the current value of the
semaphore, before the write, to an output of the particular
second hardware circuit.
CIRCUIT THAT IMPLEMENTS SEMAPHORES IN A MULTIPROCESSOR ENVIRONMENT WITHOUT RELIANCE ON ATOMIC TEST AND SET OPERATIONS OF THE PROCESSOR CORES

TECHNICAL FIELD

The present invention relates to a circuit that implements semaphores in a multiprocessor environment and, in particular, to such a circuit that does not rely on atomic test and set operations of the processor cores.

BACKGROUND

The use of single bit semaphores for signalling between computer implemented processes is well known. In general, a single bit semaphore is used in a scheme to prevent multiple processes from simultaneously accessing a single resource of a system. It is important that a single bit semaphore be accessible only in an "atomic" manner. That is, it must be guaranteed that a semaphore cannot be modified by a second process between the time that a first process reads the semaphore and potentially modifies the semaphore.

One conventional mechanism for guaranteeing such exclusivity of access includes the use of a "lock" signal. For example, the x86 family of processors provide a lock instruction that provides a serialization mechanism such that a resource (as represented by the semaphore) is restricted for use by the holder of the lock (i.e., the process that executed the lock instruction). In particular, executing a lock instruction causes a lock signal to be asserted. While the lock signal is asserted, no other process can access the semaphore. This exclusion is guaranteed by additional circuitry that stalls simultaneous access to the memory that holds the semaphore between the read and the write of the test and set operation.

While some processor architectures include the lock signal, other processors do not. With such processors, a similar mechanism may be implemented using a general purpose I/O signal assigned to generate the lock signal. It is also known to implement a lock mechanism using a dedicated hardware circuit, as disclosed by Dror in U.S. Pat. No. 5,216,886.

SUMMARY

In accordance with the present invention, a single bit semaphore circuit is provided. In accordance with the invention, each process that uses a particular single bit semaphore has associated with it semaphore interface circuitry.

The hardware semaphore is one bit wide. A first hardware circuit detects that one of the processes is writing a new value to the semaphore and forces the hardware semaphore to the new value written. A plurality of second hardware circuits are provided. Each second hardware circuit is associated with a separate one of the plurality of processes. Each particular second hardware circuit includes a detecting circuit that detects that the processor with which the particular second hardware circuit is associated is attempting to write the new value to the semaphore; and means responsive to the detecting circuit that provides the current value of the semaphore, before the write attempt, to an output of the particular second hardware circuit.

In operation, a process writes to the Set.i bit or Clear.i bit of the set and clear circuitry, respectively, associated with the process and then reads from the Test.i bit of the storage circuitry associated with the process. If the read value of the Test.i bit indicates that the Test.i bit was asserted before the write, then this indicates that the attempted "lock" of the semaphore by the process failed. Put another way, if the Test.i bit was asserted before the write, the semaphore is presently being controlled by another process. Similarly, if the Test.i bit was not asserted before the write, the semaphore is presently not being controlled by another process ([illegible] by the setting process).

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates, in block form, a semaphore circuit.

FIG. 2 illustrates a process for using the semaphore circuit.

FIG. 3 illustrates a set of semaphore circuits, like the semaphore circuit of FIG. 1.

FIG. 4 illustrates a count up process that uses a set of the semaphore circuit of FIG. 1.

FIG. 5 illustrates a count down process that uses a set of the semaphore circuit of FIG. 1.

FIG. 6 illustrates [illegible] semaphore operation for use by a two-processor system.

FIG. 7 illustrates the CREnSS0-CREnSS3 register formats (FIG. 6).

FIG. 8 illustrates the CREnSC0-CREnSC3 register formats (FIG. 6).

FIG. 9 illustrates the CREnST0-CREnST3 register formats (FIG. 6).

FIG. 10 illustrates the CREnSV0-CREnSV3 register formats (FIG. 6).

DETAILED DESCRIPTION

FIG. 1 illustrates, in block form, a semaphore circuit 100 in accordance with an embodiment of the invention. A hardware semaphore 102 holds the value of a one-bit wide semaphore, where the value held is provided at [illegible] of hardware semaphore 102.

A first circuit, including a first OR device 106 and a second OR device 108, detects that one of the processors is writing a new value to the hardware semaphore 102. In particular, disregarding for now the extending circuits 110 (discussed in detail later), if either the Set.i input 114a or the Set.i input 116a is asserted (as a result of a "set" value of "1" being written by the Processor B or Processor A, respectively), then the output 118 of the first OR device 106 is asserted to the "Set" input of the hardware semaphore 102. Similarly, if either the Clear.i input 114b or the Clear.i input 116b is asserted (as a result of a "clear" value of "1" being written by the Processor B or Processor A, respectively), then the output 120 of the second OR device 108 is asserted to the "Clear" input of the hardware semaphore 102. In the described embodiment, writing a value of "0" to the Set.i input 114a or 116a or to the Clear.i input 114b or 116b has no effect.

Meanwhile, the OR device 122 detects whether either the Set.i input 114a or the Clear.i input 114b is asserted and, if either is asserted, the output 126 of the OR device 122 is asserted. The output 126 of the OR device 122 is connected to a clock input of a flip flop circuit 130. The output 126 of the OR device 122 being asserted causes the previously held value in the hardware semaphore 102, provided from the Q.i output of the hardware semaphore 102 to the D input of the flip flop circuit 130, to be provided to the Test.i output 134 of the flip flop circuit 130.
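The write-then-read-Test.i behaviour described above can be mimicked in software. The following Python sketch is illustrative only and is not taken from the patent: the class `SemaphoreCircuit`, its method names, and the helpers `try_lock` and `unlock` are hypothetical, and the model assumes that writes reach the circuit one at a time.

```python
class SemaphoreCircuit:
    """Behavioural model (an assumption, not the actual circuit) of a
    one-bit hardware semaphore with one Set/Clear/Test port per processor."""

    def __init__(self, ports=("A", "B")):
        self.value = 0                          # models hardware semaphore 102
        self.test = {p: 0 for p in ports}       # models each port's Test.i latch

    def write_set(self, port, bit):
        if bit != 1:
            return                              # writing "0" has no effect
        self.test[port] = self.value            # latch the value held before the write
        self.value = 1                          # force the semaphore to the new value

    def write_clear(self, port, bit):
        if bit != 1:
            return                              # writing "0" has no effect
        self.test[port] = self.value            # latch the value held before the write
        self.value = 0

    def read_test(self, port):
        return self.test[port]                  # previous value, as seen by this port

    def read_value(self):
        return self.value                       # current value (not atomic with a set)


def try_lock(sem, port):
    """Write "1" to Set.i, then read Test.i; a previous value of 0 means success."""
    sem.write_set(port, 1)
    return sem.read_test(port) == 0


def unlock(sem, port):
    """Write "1" to Clear.i; by convention only the owner calls this."""
    sem.write_clear(port, 1)


if __name__ == "__main__":
    sem = SemaphoreCircuit()
    print(try_lock(sem, "A"))   # first attempt on a free semaphore
    print(try_lock(sem, "B"))   # second attempt while A still owns it
    unlock(sem, "A")
    print(try_lock(sem, "B"))   # attempt after A has released it
```

Because each port keeps its own latched copy of the previous value, a processor needs only an ordinary write followed by an ordinary read; the sketch keeps that separation by storing one `test` entry per port.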
In a similar manner, the OR device 124 detects whether either the Set.i input 116a or the Clear.i input 116b is asserted and, if either is asserted, the output 128 of the OR device 124 is asserted. The output 128 of the OR device 124 is connected to a clock input of a flip flop circuit 132. The output 128 of the OR device 124 being asserted causes the previously held value in the hardware semaphore 102, provided from the Q.i output of the hardware semaphore 102 to the D input of the flip flop circuit 132, to be provided to the Test.i output 136 of the flip flop circuit 132. That is, in response to the detection, the value in the hardware semaphore 102 before the write is held at the Test.i output 136.

At any time, the value of the hardware semaphore 102 is available to Processor B and Processor A at Value.i outputs 138 and 140, via drivers 142 and 144, respectively. However, it should be noted that the Value.i outputs 138 and 140 are not atomic with a test and set operation.

Now, the holding circuits 110 and 112, and the priority logic circuit 138 are discussed. In some situations, both Processor B and Processor A may be attempting to write a value to the semaphore circuit 102 substantially simultaneously. By "substantially", it is meant to be close enough to guarantee exclusivity of writing. For example, in a synchronous system, two writes within a certain portion of a clock cycle may be "substantially simultaneous." In an asynchronous system, two writes within a certain period as defined by signal delays within the circuitry of the system may be "substantially simultaneous". (If the processors sharing the semaphore are all on the same bus, then the priority logic circuit 138 may simply be the bus arbiter.) In this case, the priority logic circuit 138 detects the substantially simultaneous write and asserts one of a first stall signal 146 or a second stall signal 148, to the first holding circuit 110 or the second holding circuit 112, respectively. The holding circuit (110 or 112) to which the stall signal (146 or 148) is asserted stalls the start of providing the particular Set.i signal (114a or 116a) or Clear.i signal (114b or 116b) associated with that holding circuit. This stalling extends until the write conflict is resolved and preferably ends as soon as possible thereafter. Alternately, the stall may be implemented by recording the write (by the first processor, for example) and stalling a read of the test register (by the same first processor) if the read would otherwise come before the write is completed.

It is noted that FIG. 1 illustrates a semaphore circuit for use by processes executing on two separate processors (Processor A and Processor B). These two separate processors may be physically located on a single shared bus. In this case, the priority logic circuit 138 may be the standard bus arbitration logic. Alternately, the two processors may be located on separate buses, and the priority logic circuit 138 determines priority between the two buses. If the two buses work on a time sharing principle (e.g., one works on the rising edge of a clock and the other works on the falling edge of the clock), then the priority logic 138 may not be necessary.

FIG. 2 is a flowchart that illustrates a process 200 to be executed by a process that uses the semaphore circuit 100 (with the convention being that a value of "1" indicates that the semaphore is owned and a value of "0" indicates that the semaphore is not owned). At step 202, the process writes a "1" to the Set.i bit associated with the process. (For example, referring back to FIG. 1, a process executing in Processor A would write a "1" to the Set.i input 116a.) At step 204, the process 200 reads the Test.i bit associated with the process. The Test.i bit read is indicative of the previous semaphore value. (For example, the process executing in Processor A would read the Test.i output 134.)

If the Test.i bit read is one at step 204, this indicates that the lock failed. That is, the Test.i bit being one indicates that another process has set the semaphore but has not yet cleared it. (Convention dictates that no process ever clears the semaphore unless that process was the one to set the semaphore.) In this case, step 202 is repeated. By contrast, if the Test.i bit read is zero at step 204, then the lock is successful. That is, the Test.i bit being zero indicates that no process [illegible] the resource protected by the semaphore. In this case, the process uses the locked resource (as indicated by the dashed line 206) and, at step 208, the process writes a 1 to the Clear.i bit associated with the process.

It can be seen that the semaphore circuit 100 is particularly advantageous over many conventional semaphore circuits in that it blocks access only during a write [illegible] a read/write operation, which is longer. For example, in a synchronous system, this blocking may only [illegible] conventional semaphore mechanism, the blocking may take up to two bus transactions (and, therefore, many more than two clock cycles).

FIG. 3 is a block diagram that illustrates an eight-bit semaphore set 300 that is comprised of a set of semaphore circuits 100-0 through 100-7. Each of semaphore circuits 100-0 through 100-7 is similar to the semaphore circuit 100 illustrated in FIG. 1. Inputs and outputs to each semaphore circuit 100-0 through 100-7 are shown in a representative fashion for semaphore circuit 100-4. It is to be understood that the remaining semaphore circuits 100-0 through 100-3 and 100-5 through 100-7 have similar inputs and outputs as semaphore circuit 100-4. However, these inputs and outputs are configured such that they are collectively active at the same time through parallel bus bits. As is discussed in detail below with reference to the flowcharts of FIGS. 4 and 5, such a set 300 is useful a) as multiple single bit semaphores or b) for implementing a counter process that is to be shared between processes (or that is to be substantially duplicated, each duplicate for use by a different process).

FIG. 4 is a flowchart that illustrates a counting sub-process (in particular, a "count up" process) that utilizes the set 300 of semaphore circuits as a unary counter from "0" to "7". At step 402, the process reads the current value of the set using the Value.i outputs of the sets of semaphore circuits 300. At step 406, the process determines if all positions are a "1". If so, then at step 408 the process ends, because the maximum capacity of the counter has been reached. Otherwise, there is at least one position with a "0", and the process writes to the Set.i input of the first available semaphore circuit at step 410. (By convention, semaphore circuit 100-0 is in the first position and semaphore circuit 100-7 is in the last position. The first available semaphore circuit is the lowest position semaphore circuit that has a "0".) At step 412, the process reads from the Test.i output of the semaphore circuit just set. If the Test.i output read is zero, this indicates that the write was successful (analogous to the "successful lock" condition of step 206 in FIG. 2). In this case, processing ends at step 416. Otherwise, the process returns to step 404 to attempt again to increment the counter.

FIG. 5 is a flowchart that illustrates a "count down" process that utilizes the set 300 of semaphore circuits. The "count down" process is complementary to the "count up" process illustrated in FIG. 4. The process starts at step 502. At step 504, the process reads the current value of the set
using the Value.i outputs of the sets of semaphore circuits 300. At step 506, the position of the leading "1" is identified. At step 508, it is determined if there is in fact a "1" anywhere in the set. If not, then the minimum count has been reached and processing ends at step 510. Otherwise, the semaphore circuit at the position of the leading "1" is cleared by writing to the Clear.i input of that semaphore circuit. At step 514, the process reads the Test.i output of the semaphore circuit just cleared. If the Test.i output read is "1", this indicates that the write was successful and processing ends at step 518. Otherwise, processing returns to step 504 to attempt again to decrement the counter.

In one embodiment of the invention where the semaphore circuit is for sharing between two processors of a processor system, embodied in a single integrated circuit, three sets of 4-bit semaphores are provided. These semaphores each have eight associated registers that provide for setting, clearing and testing the value of each semaphore by each processor. Each set of 4-bit semaphores has eight associated registers that allow setting, clearing and testing the value of the semaphore by each of the processors. FIG. 6 illustrates all the semaphore registers in the integrated circuit. It should be noted that, in the context of this description, a "register" may not include a memory element but, rather, may just be a port having circuitry to perform an associated operation when accessed.

To set a bit-semaphore, a processor writes a "1" to the respective bit in the CREnSSi register. Then, the CREnCTi register may be read to indicate the value of the semaphore prior to the set operation. In case both the first processor and the second processor try to modify one of the semaphore sets in the same clock cycle, the processor system gives priority to the first processor while access by the second processor is extended by wait-states until the access by the first processor is completed.

No processor should share registers since it is the use of the registers and, more particularly, the associated circuitry that guarantees that a test and set operation is atomic.

Semaphore Value Read registers provide the current value of the semaphore for each of the flags. These registers are most useful during debug and testing. These registers are available to both the first processor and the second processor. It should be noted that a read from these registers cannot be used to guarantee that a following set or clear operation will succeed in capturing a semaphore.

The CREnSS0-CREnSS3 register formats (FIG. 6) are shown in FIG. 7. These registers are byte-wide, read only. Each register can be read by the first processor only, and holds the value of the semaphores prior to the last set or clear operation by that register.

The CREnSC0-CREnSC3 register formats (FIG. 6) are shown in FIG. 8. These registers are byte-wide, write-only. Each register can be written by the first processor only. A write of "1" to any of the bits clears the respective semaphore.

The CREnST0-CREnST3 register formats (FIG. 6) are shown in FIG. 9. These registers are byte-wide, read only. They can be read by the first processor only, and hold the value of a semaphore at the moment it is read.

The CREnSV0-CREnSV3 register formats (FIG. 6) are shown in FIG. 10. These registers are byte-wide, read-only. They can be read by the first processor only, and hold the value of a semaphore prior to the last write or clear operation.

A similar set of registers (EnCRSS0-EnCRSS3; EnCRSC0-EnCRSC3; EnCRST0-EnCRST3; and EnCRSV0-EnCRSV3) is provided for use by the second processor.

It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. For example, a particular hardware semaphore may be shared by more than two processors (or processes executing on the same or different processors, where the processes may contend for a shared resource) by having multiple "ports" (i.e., more set, clear [illegible] registers) for the semaphore. Each process would [illegible] registers). However, a plurality of processes could share a set of ports [illegible] guaranteed that the processes sharing the set of ports would not access the ports at the same time (e.g., if [illegible] semaphore access by masking interrupts). It is intended that the following claims define the scope of the invention and that methods and apparatus within the scope of these claims and their equivalents be covered thereby.

What is claimed is:

1. A multiprocessor system, including:

a plurality of processors;

a hardware semaphore for a shared resource, the hardware semaphore being one bit wide;

a first hardware circuit that detects one of the processors is writing a new value to the semaphore and forces the hardware semaphore to the new value written;

a plurality of second hardware circuits, each second hardware circuit associated with a separate one of the plurality of processors and each particular second hardware circuit including:

a detecting circuit that detects the processor with which the particular second hardware circuit is associated is attempting to write the new value to the semaphore;

means responsive to the detecting circuit that provides a value of the semaphore, before the writing by one of the processors, to an output of the second particular hardware circuit.

2. The multiprocessor system of claim 1, and further comprising:

means for controlling writing to the first hardware circuit by the plurality of processors such that only one of the plurality of processors at a time can write a new value to the hardware semaphore.

3. The microprocessor system of claim 2, wherein the controlling means gives priority of writing to a predetermined one of the plurality of processors.

4. A microprocessor system as in claim 2, and further comprising:

a common bus to which the plurality of processors are each connected wherein the controlling means consists of bus arbitration logic.

5. A microprocessor system as in claim 2, wherein the processors are not connected to a common bus.

6. The system of claim 1, wherein the output of the second particular hardware circuit is a first output, and wherein the particular second hardware circuit further includes:

a second output, not responsive to the detecting circuit, at which a current value of the hardware semaphore is provided.

7. A semaphore circuit for use with a plurality of processes, including:

a hardware semaphore for a shared resource, the hardware semaphore being one bit wide;

a first hardware circuit that detects one of a plurality of processes is writing a new value to the semaphore and forces the hardware semaphore to the new value written;
a plurality of second hardware circuits, each second
hardware circuit associated with a separate one of the
plurality of processes and each particular second hardware
circuit including:

a detecting circuit that detects the process with which
the particular second hardware circuit is associated is
attempting to write the new value to the semaphore;

means responsive to the detecting circuit that provides
a value of the semaphore, before the writing by one
of the processes, to an output of the second particular
hardware circuit.

8. The semaphore circuit of claim 7, and further compris- 
ing: 

means for controlling writing to the hardware by the
plurality of processes such that only one of the plurality
of processes at a time can write a new value to the
hardware semaphore.

9. The semaphore circuit of claim 8, wherein the controlling
means gives priority of writing to a predetermined one
of the plurality of processes.

10. The semaphore circuit of claim 7, wherein the output 
of the second particular hardware circuit is a first output, and 
wherein the particular second hardware circuit further 
includes: 

a second output, not responsive to the detecting circuit, at 
which a current value of the hardware semaphore is 

provided. 

11. A method of maintaining a one bit wide hardware
semaphore, including:

detecting that one of a plurality of processes is writing a
new value to the semaphore and forcing the hardware
semaphore to the new value written;

detecting which of the plurality of process is attempting to
write the value to the semaphore;

providing a value of the semaphore, before the writing by
one of the processors, to the one of the plurality of
processes detected to be attempting to write the new
value, and

controlling writing to the semaphore by the plurality of
processes such that only one of the plurality of processes
at a time can write a new value to the hardware
semaphore, and

wherein the controlling step gives priority of writing to a
predetermined one of the plurality of processes.

12. A circuit for use with a plurality of processes for 
maintaining a multi-bit value, including: 

a set of hardware semaphores, each hardware semaphore 
being one bit wide and for holding one bit of the 
multi-bit value; 
for each of the set of hardware semaphores, 
a first hardware circuit that detects one of a plurality of
processes is writing a new bit of the multi-bit value
to the semaphore and forces the hardware semaphore
to the new value written;
a plurality of second hardware circuits, each second
hardware circuit associated with a separate one of the
plurality of processes and each particular second
hardware circuit including:
a detecting circuit that detects the process with which
the particular second hardware circuit is associated
is attempting to write the new value of the
multi-bit value to the semaphore;
means responsive to the detecting circuit that provides
a value of the semaphore, before the writing
by one of the processes, to a test output of the
second particular hardware circuit,
circuitry for presenting write attempt of each bit of the
multi-bit value to each corresponding one hardware
semaphore such that the multi-bit value is written to the
set of hardware semaphores in parallel, and
circuitry for presenting test outputs of the second hardware
circuits in parallel.

13. The semaphore circuit of claim 12, and further comprising:

means for controlling writing to the set of hardware 
semaphores by the plurality of processes such that only 
one of the plurality of processes at a time can write a 
new value to the set of hardware semaphore. 

14. The semaphore circuit of claim 13, wherein the
controlling means gives priority of writing to a predetermined
one of the plurality of processes.

15. The semaphore circuit of claim 12, wherein the output 
of each second hardware circuit is a first output, and wherein 
each second hardware circuit further includes: 

a second output, not responsive to the detecting circuit, at 
which the current value of the hardware semaphore is 
provided. 

16. A method of maintaining a one bit wide hardware 
semaphore, including: 

detecting that one of a plurality of processes is writing a 
new value to the semaphore and forcing the hardware 
semaphore to the new value written; 

detecting which of the plurality of process is attempting to 
write the new value to the semaphore; 

providing a value of the semaphore, before the writing by 
one of the processors, to the one of the plurality of 
processes detected to be attempting to write the new 
value; and 

controlling writing to the semaphore by the plurality of
processes such that only one of the plurality of processes
at a time can write a new value to the hardware
semaphore, and

wherein the controlling step is performed by bus arbitration
logic.

* * * * *